


9e800ca1a4898f49a77fc0fcf7ec77e5-Paper-Conference.pdf

Neural Information Processing Systems

Here $\langle\cdot,\cdot\rangle$ denotes the Euclidean inner product, $W$ is a window size, $N(\cdot\mid v)$ a negative sampling distribution for vertex $v$, and $\sigma(y) := (1+e^{-y})^{-1}$ the sigmoid function. Usually one takes $\ell_P(y) = \log\sigma(y)$ and $\ell_N(y) = \log\sigma(-y)$.
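To make the objective concrete, here is a minimal NumPy sketch of the negative-sampling loss for a single (vertex, context) pair, using exactly $\ell_P(y)=\log\sigma(y)$ and $\ell_N(y)=\log\sigma(-y)$ as above; the names (`pair_loss`, `negatives`) are illustrative, not from the paper.

```python
import numpy as np

def sigmoid(y):
    """sigma(y) = (1 + e^{-y})^{-1}."""
    return 1.0 / (1.0 + np.exp(-y))

def pair_loss(u, v, negatives):
    """Negative-sampling loss for one (vertex, context) pair.

    u, v      -- embeddings of a vertex and one context vertex drawn
                 from within its window of size W
    negatives -- rows are embeddings of vertices sampled from N(.|v)

    Scores the observed pair with l_P(y) = log sigma(y) and each
    negative with l_N(y) = log sigma(-y); returns the negated sum,
    so that lower is better.
    """
    pos = np.log(sigmoid(u @ v))                   # l_P(<u, v>)
    neg = np.log(sigmoid(-(negatives @ u))).sum()  # sum of l_N(<u, n>)
    return -(pos + neg)

# Illustrative usage with random embeddings.
rng = np.random.default_rng(0)
u, v = rng.normal(size=8), rng.normal(size=8)
negatives = rng.normal(size=(5, 8))                # five draws from N(.|v)
print(pair_loss(u, v, negatives))
```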






Nash: Neural Adaptive Shrinkage for Structured High-Dimensional Regression

Denault, William R. P.

arXiv.org Machine Learning

Sparse linear regression is a fundamental tool in data analysis. However, traditional approaches often fall short when covariates exhibit structure or arise from heterogeneous sources. In biomedical applications, covariates may stem from distinct modalities or be structured according to an underlying graph. We introduce Neural Adaptive Shrinkage (Nash), a unified framework that integrates covariate-specific side information into sparse regression via neural networks. Nash adaptively modulates penalties on a per-covariate basis, learning to tailor regularization without cross-validation. We develop a variational inference algorithm for efficient training and establish connections to empirical Bayes regression. Experiments on real data demonstrate that Nash can improve accuracy and adaptability over existing methods.
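The abstract does not spell out the architecture, but the core idea of covariate-specific adaptive penalties can be sketched: a small network maps each covariate's side information to a nonnegative penalty weight, which then enters a weighted sparse-regression objective. The toy sketch below illustrates only that objective structure, not Nash itself; Nash fits its penalties by variational inference / empirical Bayes rather than the fixed randomly initialized network used here, and all names (`penalty_net`, `weighted_lasso_ista`) are hypothetical.

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def penalty_net(S, W1, b1, W2, b2):
    """Toy one-hidden-layer network mapping side information s_j (one
    row of S per covariate) to a nonnegative penalty weight w_j. In
    Nash these weights would be learned; here they stay at random
    initialization purely for illustration."""
    h = np.maximum(S @ W1 + b1, 0.0)          # ReLU hidden layer
    return softplus(h @ W2 + b2).ravel()      # w_j >= 0

def weighted_lasso_ista(X, y, w, lr=None, n_iter=500):
    """ISTA for 0.5*||y - X b||^2 + sum_j w_j |b_j| with per-covariate
    penalties w_j supplied by the network above."""
    n, p = X.shape
    if lr is None:
        lr = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant
    b = np.zeros(p)
    for _ in range(n_iter):
        g = X.T @ (X @ b - y)                 # gradient of the smooth part
        z = b - lr * g
        b = np.sign(z) * np.maximum(np.abs(z) - lr * w, 0.0)  # soft threshold
    return b

rng = np.random.default_rng(1)
n, p, d = 100, 20, 3
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.1 * rng.normal(size=n)
S = rng.normal(size=(p, d))                   # covariate-specific side info
w = penalty_net(S, rng.normal(size=(d, 8)), np.zeros(8),
                rng.normal(size=(8, 1)), np.zeros(1))
print(weighted_lasso_ista(X, y, w).round(2))
```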


Grace Wahba awarded the 2025 International Prize in Statistics

AIHub

The International Prize in Statistics Foundation has awarded Grace Wahba the 2025 prize for "her groundbreaking work on smoothing splines, which has transformed data analysis and machine learning". Professor Wahba was among the earliest to pioneer the use of nonparametric regression modeling. Recent advances in computing and availability of large data sets have further popularized these models, especially under the guise of machine learning algorithms such as gradient boosting and neural networks. Nevertheless, the use of smoothing splines remains a mainstay of nonparametric regression. In seminal research that began in the early 1970s, Wahba developed theoretical foundations and computational algorithms for fitting smoothing splines to noisy data.


Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping

Snyder, David, Hancock, Asher James, Badithela, Apurva, Dixon, Emma, Miller, Patrick, Ambrus, Rares Andrei, Majumdar, Anirudha, Itkina, Masha, Nishimura, Haruki

arXiv.org Machine Learning

Imitation learning has enabled robots to perform complex, long-horizon tasks in challenging dexterous manipulation settings. As new methods are developed, they must be rigorously evaluated and compared against corresponding baselines through repeated evaluation trials. However, policy comparison is fundamentally constrained by a small feasible sample size (e.g., 10 or 50) due to significant human effort and limited inference throughput of policies. This paper proposes a novel statistical framework for rigorously comparing two policies in the small sample size regime. Prior work in statistical policy comparison relies on batch testing, which requires a fixed, pre-determined number of trials and lacks flexibility in adapting the sample size to the observed evaluation data. Furthermore, extending the test with additional trials risks inducing inadvertent p-hacking, undermining statistical assurances. In contrast, our proposed statistical test is sequential, allowing researchers to decide whether or not to run more trials based on intermediate results. This adaptively tailors the number of trials to the difficulty of the underlying comparison, saving significant time and effort without sacrificing probabilistic correctness. Extensive numerical simulation and real-world robot manipulation experiments show that our test achieves near-optimal stopping, letting researchers stop evaluation and make a decision in a near-minimal number of trials. Specifically, it reduces the number of evaluation trials by up to 40% as compared to state-of-the-art baselines, while preserving the probabilistic correctness and statistical power of the comparison. Moreover, our method is strongest in the most challenging comparison instances (requiring the most evaluation trials); in a multi-task comparison scenario, we save the evaluator more than 200 simulation rollouts.
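The paper's near-optimal test is not reproduced here, but the batch-versus-sequential distinction it draws can be illustrated with a generic, deliberately conservative confidence-sequence comparison of two Bernoulli success rates: the evaluator runs paired trials and may peek after every trial without inflating the false-positive rate. The boundary below is a simple Hoeffding-plus-union-bound construction, far looser than the paper's stopping rule; `trial_a`/`trial_b` are hypothetical stand-ins for policy rollouts.

```python
import numpy as np

def sequential_comparison(trial_a, trial_b, alpha=0.05, max_trials=1000):
    """Run paired evaluation trials until a time-uniform Hoeffding
    boundary separates the two policies or the budget runs out.

    trial_a, trial_b -- callables returning a success indicator in {0, 1}
    Returns ('A', 'B', or 'inconclusive') and the number of trials used.

    Under H0 (equal success rates) the differences D_t have mean 0, so a
    union bound over t with alpha_t = 6*alpha/(pi^2 t^2) keeps the overall
    false-positive rate below alpha at every possible stopping time.
    """
    s = 0.0
    for t in range(1, max_trials + 1):
        s += trial_a() - trial_b()            # D_t in {-1, 0, 1}
        alpha_t = 6.0 * alpha / (np.pi ** 2 * t ** 2)
        bound = np.sqrt(2.0 * t * np.log(2.0 / alpha_t))
        if abs(s) >= bound:                   # boundary crossed: stop early
            return ("A" if s > 0 else "B"), t
    return "inconclusive", max_trials

rng = np.random.default_rng(2)
verdict, n = sequential_comparison(lambda: rng.random() < 0.9,   # policy A
                                   lambda: rng.random() < 0.6)   # policy B
print(verdict, "after", n, "trials")
```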


Nonparametric independence tests in high-dimensional settings, with applications to the genetics of complex disease

Castro-Prado, Fernando

arXiv.org Machine Learning

[PhD thesis of FCP.] Nowadays, genetics studies large numbers of very diverse variables. Mathematical statistics has evolved in parallel with its applications, with much recent interest in high-dimensional settings. In the genetics of human common disease, a number of relevant problems can be formulated as tests of independence. We show how defining adequate premetric structures on the support spaces of the genetic data allows for novel approaches to such testing. This yields a solid theoretical framework that reflects the underlying biology and allows for computationally efficient implementations. For each problem, we provide mathematical results, simulations, and an application to real data.
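The thesis's specific premetrics and statistics are not given in the abstract, but a classical representative of distance-based independence testing is the squared distance covariance of Székely et al., calibrated by permutation. The sketch below uses Euclidean distances; swapping in a problem-specific premetric on each support space is precisely the kind of flexibility the abstract alludes to. All names are illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def double_center(D):
    """Subtract row and column means of a distance matrix, add back
    the grand mean (the standard dCov centering)."""
    return D - D.mean(0) - D.mean(1)[:, None] + D.mean()

def dcov2(A, B):
    """V-statistic estimator of squared distance covariance."""
    return (A * B).mean()

def independence_test(x, y, n_perm=999, seed=0):
    """Permutation test of independence between samples x and y,
    given as (n, d) arrays, based on squared distance covariance."""
    rng = np.random.default_rng(seed)
    A = double_center(squareform(pdist(x)))
    B = double_center(squareform(pdist(y)))
    stat = dcov2(A, B)
    perm_stats = []
    for _ in range(n_perm):
        idx = rng.permutation(len(x))          # permute one sample
        perm_stats.append(dcov2(A, B[np.ix_(idx, idx)]))
    p = (1 + sum(s >= stat for s in perm_stats)) / (n_perm + 1)
    return stat, p

rng = np.random.default_rng(3)
x = rng.normal(size=(100, 2))
y = x ** 2 + 0.1 * rng.normal(size=(100, 2))   # dependent on x, nonlinearly
print(independence_test(x, y))
```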


Asymptotic Gaussian Fluctuations of Eigenvectors in Spectral Clustering

Lebeau, Hugo, Chatelain, Florent, Couillet, Romain

arXiv.org Machine Learning

The performance of spectral clustering relies on the fluctuations of the entries of the eigenvectors of a similarity matrix, which has been left uncharacterized until now. In this letter, it is shown that the signal $+$ noise structure of a general spike random matrix model is transferred to the eigenvectors of the corresponding Gram kernel matrix and the fluctuations of their entries are Gaussian in the large-dimensional regime. This CLT-like result was the last missing piece to precisely predict the classification performance of spectral clustering. The proposed proof is very general and relies solely on the rotational invariance of the noise. Numerical experiments on synthetic and real data illustrate the universality of this phenomenon.
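A quick Monte Carlo check of the claimed phenomenon under an assumed rank-one, two-class spike model (the dimensions, signal strength, and model below are illustrative choices, not the paper's exact setting): the rescaled entries of the top eigenvector of the Gram matrix should behave like Gaussian variables, e.g. show near-zero excess kurtosis across runs.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, runs = 200, 400, 300
labels = np.repeat([1.0, -1.0], n // 2)        # two balanced classes
mu = 2.0 * np.ones(p) / np.sqrt(p)             # class-mean direction, ||mu|| = 2

entry = []
for _ in range(runs):
    Y = np.outer(mu, labels) + rng.normal(size=(p, n))  # signal + noise
    K = Y.T @ Y / p                            # Gram kernel matrix
    vals, vecs = np.linalg.eigh(K)
    v = vecs[:, -1]                            # top eigenvector
    v *= np.sign(v @ labels)                   # fix the sign ambiguity
    entry.append(v[0] * np.sqrt(n))            # rescaled first entry
entry = np.array(entry)
print("mean %.3f  std %.3f  excess kurtosis %.3f"
      % (entry.mean(), entry.std(),
         ((entry - entry.mean()) ** 4).mean() / entry.var() ** 2 - 3))
```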